(6-5) FORECASTING w TREND (Deterministic)

Nao Mimoto - Dept. of Statistics : The University of Akron





1. Forecasting Lake HURON with ARMA (Direct fit)



  • Recall that the Lake Huron data was fit in three different ways (cf. Lec 14), each yielding an adequate model.

If we treat it as a “stationary” series, we can try to fit an ARMA(p,q) model.

## [1] 97

## Warning in kpss.test(A): p-value smaller than printed p-value
##        KPSS  ADF    PP
## p-val: 0.01 0.24 0.025
## Series: Lake 
## ARIMA(1,0,1) with non-zero mean 
## 
## Coefficients:
##          ar1     ma1    mean
##       0.7665  0.3393  9.1290
## s.e.  0.0773  0.1123  0.3861
## 
## sigma^2 estimated as 0.4784:  log likelihood=-101.09
## AIC=210.18   AICc=210.62   BIC=220.48

##   B-L test H0: the sereis is uncorrelated
##   M-L test H0: the square of the sereis is uncorrelated
##   J-B test H0: the sereis came from Normal distribution
##   SD         : Standard Deviation of the series
##       BL15  BL20  BL25  ML15  ML20   JB    SD
## [1,] 0.963 0.952 0.934 0.567 0.641 0.89 0.684

Analysis 1 (direct fit)

  • auto.arima() chooses ARMA(1,1) with non-zero mean by minimum AICc.

  • ARMA(1,1) with constant mean was fit directly to the data. With observation \(Y_t\),
    \[ Y_t \hspace{3mm} = \hspace{3mm} \mu + X_t \\\\ X_t \hspace{3mm} = \hspace{3mm} \phi_1 X_{t-1} + e_t + \theta_1 e_{t-1} \]

##      Pred.rMSE    95%PI        Mean
## [1,] 0.8029921 1.526601 -0.07798477
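
The code behind the `Pred.rMSE` figure is not shown. As a rough illustration of what a rolling 1-step prediction rMSE measures, here is a minimal sketch in Python (not the R used for the output above). It makes two assumptions that may differ from the original computation: the coefficients are held fixed at the values printed in the ARIMA(1,0,1) summary rather than re-estimated at each step, and a simulated series stands in for the Lake data. The function names are hypothetical.

```python
import math
import random

def simulate_arma11(n, mu, phi, theta, sigma, seed=0, burn=200):
    """Simulate Y_t = mu + X_t with X_t = phi*X_{t-1} + e_t + theta*e_{t-1}."""
    rng = random.Random(seed)
    x_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn):
        e = rng.gauss(0.0, sigma)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn:                 # drop the burn-in so the start-up values wash out
            out.append(mu + x)
        x_prev, e_prev = x, e
    return out

def rolling_one_step_rmse(y, mu, phi, theta, start):
    """Predict y[t] from the past for every t; score only the errors with t >= start."""
    e_hat = 0.0                       # estimate of the previous innovation (unknown at t = 0)
    sq_errors = []
    for t in range(1, len(y)):
        pred = mu + phi * (y[t - 1] - mu) + theta * e_hat
        err = y[t] - pred
        if t >= start:
            sq_errors.append(err * err)
        e_hat = err                   # the 1-step error serves as the innovation estimate
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Coefficients copied from the ARIMA(1,0,1) summary above; the data are simulated (assumption).
MU, PHI, THETA, SIGMA2 = 9.1290, 0.7665, 0.3393, 0.4784
y = simulate_arma11(2000, MU, PHI, THETA, math.sqrt(SIGMA2), seed=1)
rmse = rolling_one_step_rmse(y, MU, PHI, THETA, start=50)
print(round(rmse, 3), round(math.sqrt(SIGMA2), 3))
```

When the data really come from the fitted model and the coefficients are known, the two printed numbers should be close. With a short real series and estimated coefficients, they need not be.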








2. Forecasting with ARMA with Linear Trend

If we assume that the level is decreasing linearly, we can try to fit a line with ARMA errors.

## 
## Call:
## lm(formula = Lake ~ time(Lake))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.50919 -0.74760 -0.01556  0.75966  2.53409 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 55.278141   7.922614   6.977 4.02e-10 ***
## time(Lake)  -0.024071   0.004119  -5.843 7.16e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.136 on 95 degrees of freedom
## Multiple R-squared:  0.2644, Adjusted R-squared:  0.2566 
## F-statistic: 34.14 on 1 and 95 DF,  p-value: 7.165e-08

## Series: Reg2$residuals 
## ARIMA(1,0,1) with zero mean 
## 
## Coefficients:
##          ar1     ma1
##       0.6671  0.3827
## s.e.  0.0937  0.1135
## 
## sigma^2 estimated as 0.452:  log likelihood=-98.72
## AIC=203.44   AICc=203.7   BIC=211.16

##   B-L test H0: the sereis is uncorrelated
##   M-L test H0: the square of the sereis is uncorrelated
##   J-B test H0: the sereis came from Normal distribution
##   SD         : Standard Deviation of the series
##       BL15  BL20  BL25 ML15  ML20    JB    SD
## [1,] 0.985 0.974 0.969 0.19 0.189 0.782 0.669
## Series: Lake 
## Regression with ARIMA(1,0,1) errors 
## 
## Coefficients:
##          ar1     ma1  intercept     xreg
##       0.6682  0.3817    55.5443  -0.0242
## s.e.  0.0936  0.1136    18.0324   0.0094
## 
## sigma^2 estimated as 0.4612:  log likelihood=-98.66
## AIC=207.33   AICc=207.98   BIC=220.2

Analysis 2 (linear trend)

  • A linear trend was fit to \(Y_t\), and then the residuals were fit with ARMA(1,1).

  • In other words, \[ Y_t \hspace{3mm} = \hspace{3mm} a + bt + X_t \\\\ X_t \hspace{3mm} = \hspace{3mm} \phi_1 X_{t-1} + e_t + \theta_1 e_{t-1} \]

##      Pred.rMSE    95%PI      Mean
## [1,] 0.7817429 1.497827 0.2225482
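
Under this model, a 1-step forecast is the trend line extended one step plus the ARMA(1,1) prediction of the next residual. The sketch below, in Python rather than R, follows the two-step route (least-squares line first, then ARMA on the residuals). The function names and the toy series are hypothetical, and the \(\phi_1, \theta_1\) values are simply copied from the residual fit printed above.

```python
def ols_line(t, y):
    """Least-squares intercept a and slope b for y = a + b*t."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = y_bar - b * t_bar
    return a, b

def forecast_trend_arma11(t, y, t_next, phi, theta):
    """1-step forecast a + b*t_next + X_hat, with X_hat = phi*X_n + theta*e_hat_n."""
    a, b = ols_line(t, y)
    x = [yi - (a + b * ti) for ti, yi in zip(t, y)]   # detrended residuals X_t
    e_hat = 0.0                                       # innovation estimate, started at 0
    for i in range(1, len(x)):
        e_hat = x[i] - (phi * x[i - 1] + theta * e_hat)
    x_next = phi * x[-1] + theta * e_hat
    return a + b * t_next + x_next

# Toy series (hypothetical values, not the Lake data).
years = list(range(1875, 1885))
level = [10.4, 11.9, 10.6, 10.2, 10.1, 11.0, 10.9, 10.3, 10.8, 10.5]
print(forecast_trend_arma11(years, level, 1885, phi=0.6671, theta=0.3827))
```

The joint fit shown in the "Regression with ARIMA(1,0,1) errors" output estimates the line and the ARMA coefficients together, so its forecasts can differ slightly from this two-step version.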





3. Forecasting with ARIMA (Random Trend)

Let’s look at the first difference of the observations (diff(Y) in R): \[ \nabla Y_t = Y_t - Y_{t-1} \]

## Warning in adf.test(A): p-value smaller than printed p-value
## Warning in pp.test(A): p-value smaller than printed p-value
## Warning in kpss.test(A): p-value greater than printed p-value
##        KPSS  ADF   PP
## p-val:  0.1 0.01 0.01
## Series: Lake 
## ARIMA(1,1,2) 
## 
## Coefficients:
##          ar1      ma1      ma2
##       0.6385  -0.5349  -0.3514
## s.e.  0.1345   0.1445   0.1055
## 
## sigma^2 estimated as 0.4812:  log likelihood=-99.88
## AIC=207.76   AICc=208.2   BIC=218.02

##   B-L test H0: the sereis is uncorrelated
##   M-L test H0: the square of the sereis is uncorrelated
##   J-B test H0: the sereis came from Normal distribution
##   SD         : Standard Deviation of the series
##       BL15  BL20  BL25  ML15  ML20    JB    SD
## [1,] 0.979 0.956 0.942 0.524 0.495 0.577 0.677

Analysis 3 (take difference)

  • The difference of \(Y_t\) was taken. The differenced series looks stationary, and ARIMA(1,1,2) was fit to \(Y_t\).

  • \[ \nabla Y_t \mbox{ is ARMA(1,2)} \\ Y_t \mbox{ is ARIMA(1,1,2)} \]

##      Pred.rMSE    95%PI       Mean
## [1,] 0.7427918 1.498318 0.09050329
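
With a differenced model, the forecast is built on the differences and then added back to the last observed level: \(\hat Y_{n+1} = Y_n + \widehat{\nabla Y}_{n+1}\). A minimal Python sketch of that bookkeeping follows. The helper names are hypothetical, the start-up innovations are crudely set to zero, and the toy numbers are not the Lake data. The coefficients are copied from the ARIMA(1,1,2) summary above.

```python
def diff(y):
    """First difference: y[t] - y[t-1]."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]

def arma12_one_step(d, phi, theta1, theta2):
    """1-step forecast of a zero-mean ARMA(1,2) series d."""
    e1, e2 = 0.0, 0.0                 # estimates of e_{t-1} and e_{t-2}, started at 0
    for i in range(1, len(d)):
        pred = phi * d[i - 1] + theta1 * e1 + theta2 * e2
        e1, e2 = d[i] - pred, e1      # shift the innovation estimates forward
    return phi * d[-1] + theta1 * e1 + theta2 * e2

def forecast_level(y, d_forecasts):
    """Turn forecasts of the differences into forecasts of the level by cumulative summing."""
    level = y[-1]
    out = []
    for d in d_forecasts:
        level += d
        out.append(level)
    return out

# Toy series (hypothetical values).
y = [10.4, 11.9, 10.6, 10.2, 10.1, 11.0, 10.9, 10.3]
d_next = arma12_one_step(diff(y), phi=0.6385, theta1=-0.5349, theta2=-0.3514)
print(forecast_level(y, [d_next]))
```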





Comparing MSE

  • For Models 1, 2, and 3, the estimates of \(\sigma^2\) were \(0.4784, \, 0.4612, \, 0.4812\) respectively.

  • Taking square roots, the \(\hat \sigma\) are \(0.6917, \, 0.6791, \, 0.6937\).

  • On the other hand, the rolling 1-step prediction rMSEs for the three models were \(0.8030, \, 0.7817, \, 0.7428\).

  • Why do they not agree? Which model fits best?

  • What is over-fitting?
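
The comparison can be checked numerically. The short Python sketch below takes the \(\hat\sigma^2\) values from the three model summaries and the `Pred.rMSE` values from the three output lines, in model order 1, 2, 3. It then computes the square roots and the ratio of rolling rMSE to \(\hat\sigma\). The dictionary names are just for illustration.

```python
import math

# Values copied from the model summaries and Pred.rMSE lines above (Models 1, 2, 3).
sigma2_hat = {"Model 1": 0.4784, "Model 2": 0.4612, "Model 3": 0.4812}
rolling_rmse = {"Model 1": 0.8029921, "Model 2": 0.7817429, "Model 3": 0.7427918}

sigma_hat = {m: math.sqrt(v) for m, v in sigma2_hat.items()}
ratio = {m: rolling_rmse[m] / sigma_hat[m] for m in sigma_hat}

for m in sigma_hat:
    print(f"{m}: sigma_hat={sigma_hat[m]:.4f}  "
          f"rolling rMSE={rolling_rmse[m]:.4f}  ratio={ratio[m]:.3f}")

# Which model looks best under each criterion?
best_in_sample = min(sigma_hat, key=sigma_hat.get)
best_rolling = min(rolling_rmse, key=rolling_rmse.get)
print(best_in_sample, best_rolling)
```

Every ratio exceeds 1, and the two criteria rank the models differently. \(\hat\sigma\) comes from residuals on the same data used to estimate the coefficients, so a gap between it and the prediction rMSE is the kind of pattern the over-fitting question points at.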